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Abstract: We describe how long-term solar system orbit integration could be implemented on 
a parallel computer. The interesting feature of our algorithm is that each processor is assigned 
not to a planet or a pair of planets but to a time- interval. Thus, the 1st week, 2nd week,. . . , 
1000th week of an orbit are computed concurrently. The problem of matching the input to the 
(n + l)-st processor with the output of the n-th processor can be solved efficiently by an iterative 
procedure. Our work is related to the so-called waveform relaxation methods in the computational 
mathematics literature, but is specialized to the Hamiltonian and nearly integrable nature of solar 
system orbits. Simulations on serial machines suggest that, for the reasonable accuracy requirement 
of 1" per century, our preliminary parallel algorithm running on a 1000-processor machine would 
be about 50 times faster than the fastest available serial algorithm, and we have suggestions for 
further improvements in speed. 



Version of 2 nd May 1996 



1 



1. INTRODUCTION 

Planetary orbits have now been studied over times of 10 8 -10 10 yr using various techniques, and 
exploring progressively longer time scales always seems to reveal interesting new phenomena. Laskar 
(1989, 1990, 1994) used a semi-analytic secular perturbation theory to follow planetary orbits for 
up to 10 Gyr, and found weak chaos in the orbits of the inner planets, which led for example to 
significant evolution in Mercury's orbit over the lifetime of the solar system. Sussman &: Wisdom 
(1992) directly integrated the orbits of the nine planets for 100 Myr, verifying Laskar's discovery 
that the orbits of the inner planets are chaotic, and also finding weak chaos in the orbits of the outer 
planets which did not appear in Laskar's secular theory. Laskar & Roboutel (1993) and Touma & 
Wisdom (1993) by different methods found that Mars's obliquity is chaotic, whereas the Earth's 
is apparently stabilized by the Moon. Naturally, long integrations are not ephemerides but probes 
of qualitative and statistical properties of the orbits. We remark that it is possible to extract 
information about dependence on initial conditions from Fourier analysis of a single integration 
(Laskar 1993). 

Since the longest direct orbit integrations are still much shorter than the age of the solar 
system, there is motivation to invent better numerical integration algorithms. The best long-term 
accuracy is achieved by methods that exploit (a) the Hamiltonian, and (b) the nearly Keplerian 
properties of the equations. Such methods were introduced by Wisdom & Holman (1991; hereafter 
WH91) and independently by Kinoshita, Yoshida & Nakai (1991), with subsequent improvements 
by various authors; for brevity, we will call them 'generalized leapfrog' in this paper, because of 
their formal resemblance to leapfrog. An n-th order generalized leapfrog integrator can have global 
error of 0(e 2 r n T), where e < 10~ 3 is the relative scale of planetary perturbations, r is the timestep, 
and T the total length of an integration. The e 2 factor and the linear dependence on T (rather 
than quadratic, as for general purpose integrators) make low-order schemes worthwhile. The state 
of the art demands an accuracy of ~ 10~ 9 in the frequencies (or ~ 1" per century for Mercury), 
which is achievable by second-order generalized leapfrog schemes with r ~ lweek — see Wisdom, 
Holman & Touma (1994) or Saha & Tremaine (1994; hereafter ST94). But a 5 Gyr integration 
remains prohibitively expensive at current floating-point speeds. 

This situation prompts one to think about parallel integration methods, and in this paper 
we develop a parallel integrator. We adopt a very different sort of parallelism from galactic or 
cosmological parallel codes; instead of distributing the force calculation among different processors, 
we distribute different time slices among different processors. Solar system dynamics invites such 
'timelike' rather than the usual 'spacelike' parallelism because orbits evolve very slowly compared to 
orbital times, whereas the number of forces that could be distributed between processors is relatively 
small. The new algorithm is very similar to second-order generalized leapfrog in accuracy, and is 
better at controlling roundoff error; but it is arithmetically more cumbersome, so — at least in the 
present version of a parallel integrator — the speed gain possible is smaller than the number of 
processors. 
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2. A PARALLEL ALGORITHM 

Consider some system of ordinary differential equations z = f(z), or equivalently, a set of quadra- 
tures: 

z(t)=z(0)+ f f{z{t'))dt>. (1) 

J 

We can approximate the quadratures by sums; discretizing time with timestep r and writing z n to 
mean z(nr), we can write, for example: 

n-l 

Zn = Z +T ^ f (k( Z m + ^m+l)) ! H = 1 . .N. (2) 

m=0 

This can now be regarded as a set of nonlinear equations in many variables (iV times the dimen- 
sionality of z), which can be solved by some iterative process. In an TV-processor computer, the 
right side of equation (2) can be evaluated in parallel, with each processor computing one term 
and 0(log./V) sums. If the number of iterations is much less than the number of processors, then 
parallelism will yield a speed gain. 

Solving the huge set of equations (2) in <C -/V iterations may seem a hopeless task. But in fact 
it can often be accomplished (provided the original differential equations are very smooth and there 
is a good starting guess for the z n ) by Newton-Raphson or variants of it not requiring derivatives 
(see e.g., Ortega & Rheinbolt 1970). The general idea is known as waveform relaxation; we suggest 
Bellen & Zennaro (1989) as an introduction to the literature in this area. In this paper, however, 
we will proceed from first principles, because for nearly integrable Hamiltonian systems the solution 
becomes quite simple. 

For a Hamiltonian system with canonical variables z = (p, q) the particular quadrature for- 
mula in equation (2) amounts to integrating the differential equations with the 'implicit midpoint' 
integrator 

d 

Pi = Po - t ^- H U(z + z{j) , 

9i = lo + r-^H {\{z + z x )) , 

with timestep r (which can be kept fixed when integrating planetary orbits) . This is a second order 
symplectic integrator, which is to say that the transformation zq — > z\ differs from the true time 
evolution by 0(r 3 ) but is exactly canonical. It is one of a class of symplectic integrators derived 
by Feng (1986). In the Appendix we derive a somewhat stronger statement: the integrator exactly 
follows the dynamics of a 'surrogate' Hamiltonian 

H = H + H erY , (4) 

which differs from the exact Hamiltonian H by an error Hamiltonian H err which is 0(r 2 ) and 
non-autonomous (H err is periodic in time with period r). In any case, the symplectic property 
guarantees that spurious dissipation will not occur. 

Consider now an H that is close to some integrable Hamiltonian, with p, q being action-angle 
variables for the latter, thus: 

H = H^(p)+eH^(p, q ,t). (5) 
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The formula (2) for such a system becomes 

n-l 

m=0 
n-l 

q n = qo + T^2 

m=0 

The error Hamiltonian is 0(er 2 ) (see Appendix). The error has the same order as in second- 
order generalized leapfrog, suggesting that the performance will be similar. Generalized leapfrog 
is preferable for serial integrations because it is explicit, but as part of an iterative parallel scheme 
explicit methods offer no advantage, and implicit midpoint leads to a simpler formulation. 

We now need to solve the equations (6) for the Pniqn- 

Let us abbreviate the equations as 

Pn-Po- u(P n ,Q n ) = 0, 
q n -qo~ V(P n ) - v(P n , Q n ) = 0, 

where u and v are 0(e) smaller than V, and P n , Q n are the vectors (po--Pn), (<7o- Suppose 
that P n ,Q n are currently at some good guess values for the exact solutions P n ,Q' n . We start out 
as if to derive a Newton- Raphson iteration, and expand (7) to leading order in z n — z' n : 

Pn-P0- u{Pn,Qn) - [p' n ~ P0 ~ U (K, Q'n)] + Pn ~ Pn 

— (terms involving derivatives of u) 
q n -q - V(P n ) - v(P n ,Q n ) ~ [q' n - q - V(P' n ) - v(P n , Q' n )\ + q n - q' n 
~ El=i(P™ ~ Pm)d/dPmV(Pn) — (terms involving derivatives of v) 

The terms in square brackets are zero since z' solves (7). We then substitute V(P n ) — V(P n ) for 
^ =1 (p m — Pm) (9/dPm) V{P n ) and neglect derivatives of u and v (thus departing from Newton- 
Rap hson and sacrificing quadratic convergence), to obtain 

Pn <- P0 +u{P n ,Q n ), 
q'n^q* + V{p'l)+v{Pn,Qn): 

Once in the neighborhood of the correct solution, the simple iteration (9) will give linear 
convergence by a factor 0(e) per iteration. The obvious starting guess to use is the solution to the 
unperturbed equations, which is trivial since p, q are action-angles Better guesses are possible, such 
as the mean semi-major axes and the eccentricities and inclinations given by secular perturbation 
theory; however, our tests with toy problems suggest that these will not lead to dramatic reductions 
in the number of iterations required for convergence. Note that making N excessively large will 
not introduce the danger of divergence of iterations, because the sequence of approximants to z n 
does not influence the approximants at any z m<n . Provided the iterative process converges for the 
serial formula (3), the worst that can happen in the parallel case is that the convergence becomes 
proportional to N. 



^ q H^(\{z m + z m+1 )), 

^ {\{p m + Pm+1 )) + e A tfd) (± {Zm + Zm+l) ) 



(6) 



(7) 



(8) 
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3. AN SIMPLE EXAMPLE 

It is helpful to apply the iterative procedure of the previous section as applied to a simple example. 
We consider the pendulum Hamiltonian 

H = \p 2 - ecosq. (10) 

For this case, because integrals of the type (1) can be done exactly, we will leave them as integrals 
rather than replacing them by sums. We also make one change of notation for this section: p k and 
q k will refer to the k-ih approximation to p(t) and q(t), and not to values on a grid. 

With these changes the iterative scheme (9) becomes 

Po(t) = p(0), 

?o(t) =?(0)+p(0)t, 



(13) 



p fc+ i(t) = p(0) - e [ sm[q k (t')} dt' , (H) 
J o 

9fc+i(t) = ?(0) + / Pk+ i(t')dt'. 
Jo 

We can derive expressions for the errors in the n-th approximation 

<5p fe (i)=Pfc(*)-p(i), Sq k (t)=q k (t)-q(t), (12) 
by comparing the iterates (11) with the exact equations 

p(t)=p(0)-e / sm[q(t')]dt', 
Jo 

q (t)=q(0)+ [ p{t')dt'. 
Jo 

Assuming \Sq k \ <C 1 and hence sin[(7fc(t)] ~ sin[g(t)] for all k > 0, we get: 

Spo(t) = e [ sm[q (t')]dt', 
Jo 

Sqo(t)= [ 5 Po (t')dt', 
Jo 

5p k+ i{t) = -e / cos[g (O] Sq k {t') dt' , 

JO 

%+i(i)= / 5p fc+1 (t / )^'- 
Jo 

These equations can be solved analytically, although the expressions for Sp k {t) and Sq k (t) rapidly 
become quite complicated as k grows. For large values of et we find that to leading order the even 
iterates satisfy 

!<«')! = + % {W)) 2k+1 +0{t2k ~ ly ' l6p2k(t)l = ^f^l^WI+o^" 1 )- ( 15 ) 



(14) 
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Figure 1. Diagnostics from a numerical solution for the pendulum Hamiltonian (10) with e = 0.01, 
p(0) = 1, q(0) — 0, integrated using the algorithm of Section 2, with timestep t = 0.1. The four curves are 
the number of iterations required for convergence to 10 -4 , 10 -6 , 10 -8 , and 10 -10 . The corresponding plot 
for t — 0.01 is almost identical. 



Thus the iterative scheme will converge for all et. For small et the number of iterations required 
will be roughly n it oc — l/ln(ei). For et S> 1, we expect that if p(0) = 1, then n\ t oc et iterations 
will be required for convergence. 

We plot in Figure 1 the number of iterations taken by the scheme (9) to converge to various 
levels as a function of time. This case differs in detail from the one we have just analyzed, because 
it refers to a discretized situation as in Section 2; this means that the iterations are converging not 
to the true solution, but to a surrogate differing from the true solution by a timestep-dependent 
amount. Also, the approximation sin[</fc(i)] ~ sin[g(t)] we made above is not always valid. Nev- 
ertheless, Figure 1 agrees with our qualitative prediction of number of iterations oc et for large et; 
quantitatively, the number of iterations is pa 4et. 
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4. SURROGATES, AND WARMUP 

If a generalized leapfrog integrator is applied to a nearly integrable Hamiltonian, the output can 
be expressed formally as the time evolution under the surrogate Hamiltonian 

H = H + er n x (power series in e and r 2 ), (16) 

for an n-th order scheme. We say 'formally' because the series in equation (16) does not in general 
converge; nevertheless, analysis of this series is useful. In particular, the er™ indicates long term 
errors of 0(er n T). It turns out that one can eliminate the effect of secular 0(e) terms in H 
by applying a special small transformation to either the initial conditions or the output, thus 
reducing the errors to 0(e 2 r n T). Various transformations which accomplish this are given in Saha 
& Tremaine (1992), WHT94, and ST94. Probably the simplest comes from the latter paper, and 
consists of the following 'warmup' to the main integration. 

( Integrate backward (or forward) for a large number of orbits with the timestep r 
scaled down by < e 1 /™ while linearly reducing the perturbation strength to zero) 

( Integrate forward (or backward) to the starting point with the regular timesteps r, 
while linearly reviving the perturbation strength to the correct value ) 

In the Appendix, we show that errors in the implicit midpoint integrator also can be described 
by a surrogate Hamiltonian, which differs from H by an error Hamiltonian that is 0(er 2 ) if H is 
nearly integrable. Thus the above warmup procedure will reduce long-term errors from 0(er 2 T) 
to 0(e 2 r 2 T) whether we use generalized leapfrog or implicit midpoint. In fact, the results from 
the implicit midpoint method, both without and with warmup, are very similar to those from the 
corresponding generalized leapfrog method. 



5. A SIMULATION WITH NINE PLANETS 

To apply the new integration method to planetary orbits, we must: 

(i) reduce the Hamiltonian to the form H m + ei? (1) , where now H (0) is Keplerian and e represents 
the smallness of planetary perturbations; 

(ii) find action-angle variables p, q for H (0) ; 

(hi) find expressions for (dH m /dp) and (dH (1) /dq). 

Item (i) is standard, though not trivial. Barycentric variables will not do because the Sun's 
motion is not nearly Keplerian. The key (see WH91) is to take Mercury's coordinates relative to 
the Sun, Venus's relative to the barycenter of Sun-Mercury, and so on (the ordering of planets is 
immaterial). The time derivatives of such coordinates — called the Jacobi coordinates — turn out 
to be proportional to the conjugate momenta. The Sun's motion drops out and is replaced by 
the ignorable motion of the solar system barycenter. The Hamiltonian has the desired form, with 
one Keplerian term for each planet plus perturbation terms of O(e). Post-Newtonian effects from 
general relativity add a further complication; a procedure for handling these effects is described in 
ST94. 
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For item (ii), we transform the Jacobi variables for each planet to Poincare elements A, A, 
£i>£2) ?7i , 772 - These relate to the more familiar Keplerian elements a, e, i, Q, oj, M as follows: 



A = 

6 = 
6 = 
m = 
m = 



y/JIa, A = Q + 00 + M, 



2A(1 - v 7 ! - e 2 ) cos(Sl + w) ~ \/Aecos(Sl + u;), 
1 

2A(1 - \A - e 2 ) 2 sin(^ + u;) ~ -VAesm(Q + u), 

1 

2Aa/ 1 — e 2 (l — cos i) cos$7 ~ \/Aicosr2, 

1 

2Ai/ 1 — e 2 (l — cosi) 2 sin f2 ~ — y/Xi sin 17. 



(18) 



The first two are action-angles of H (0> ; the other four are not action-angles but they are canonical, 
which is good enough since H m involves only the A of each planet. Poincare elements remain 
non-singular for zero e or i. 

Item (iii) consists of transforming the accelerations to Poincare variables. This is straightfor- 
ward with the chain rule, but boring to program and expensive to compute. With nine planets, we 
found that transforming accelerations from cartesian Jacobi variables to Poincare elements took 
more than four times as long as evaluating the accelerations in the first place. 




1000 10 4 10 5 10 6 

Time in yr 

Figure 2. Maximum error in mean anomaly M up to time T, against T, with a timestep of 7^ days. 
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We have programmed the algorithm to simulate its operation on a parallel computer, and 
tested the results of a medium-length integration. We used a timestep of 7^ days and a warmup 
length of 5000 yr with timestep scaled down by 32. Our single actual processor was simulating 4096 
concurrent ones. Figure 2 shows the estimated errors in our integration, gotten by comparing with 
a more accurate version of the integration by Quinn, Tremaine & Duncan (1991). It is quite similar 
to Figure 2 in ST94 which shows error estimates for generalized leapfrog with the same timestep. 




Figure 3. The average number of iterations required for convergence (to ~ 50 bits of precision) versus 
the number N of simulated processors. The two panels show the same data using linear and logarithmic 
scales for N. 



In Figure 3 we show the average number of iterations n;t our algorithm needed to converge to 
~ 50 bits of precision (relative error ~ 10~ 15 ), over blocks of steps of different lengths N (in effect, 
different numbers of processors). The plots exhibit the linear dependence at large N predicted by 
the simple model of Section 3, and can be fitted by 

n it ^ 6 + f ) for N > 500. (19) 

V 1000 7 V ; 

While the above results demonstrate that our algorithm works, our current implementation is 
probably far from optimal. In particular, our program does about 10 times as much arithmetic per 
iteration as generalized leapfrog with individual timesteps to complete an integration of the same 
length. The picture changes, though, when we consider the control of roundoff errors. We consider 
this and other prospects for improvement in the concluding section. 
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6. PROSPECTS 

A serious obstacle to accurate long solar system integrations is roundoff error (e.g. Quinn & 
Tremaine 1990). Roundoff errors generally act like spurious dissipation, introducing secular drift 
in the energy, and hence positional errors oc T 2 . More precisely, if 5 is the floating point precision, 
and P the orbital period, the expected longitude error after time T is A A « T 2 5/(tP). For a 5 Gyr 
integration with a 1 week timestep, at least 76 bits are needed to keep the roundoff contribute to 
Mercury's longitude error below 0.01 radians. (Of course, 0.01 radians is much smaller than the 
expected integration errors, which are inflated by weak chaos, but it is good to set low tolerances 
for roundoff error, because it is an unphysical effect.) IEEE double precision provides a mantissa of 
only 53 bits. Many compilers offer quadruple precision, but that is implemented in software, and 
is slower by a factor of 20 at least. 

Our new method offers a significant improvement in effective precision with only a small 
sacrifice of speed. The reason is that the perturbative impulses (u and v in equation [9]) are by 
far the major computational cost. For a 1 week timestep and e ~ 10~ 3 , these impulses are 10~ 4 
smaller than p, q; hence they require ~ 13 bits less precision. So by doing the force calculation in 
double precision and the few remaining operations in quadruple precision, one can effectively gain 
~ 13 bits of precision. 

Also, some floating point units (including the Intel n86 series and the Motorola 68n series) use 
80 bits internally and compilers for these may offer an extended double precision type with a 64-bit 
mantissa. On such hardware, we expect that an effective precision of ~ 77 bits will cost < 20% 
more time than IEEE double precision. 

Similar games can be played with other integrators, of course, but generally with less success. 
In multistep integrators (see Quinlan & Tremaine 1990) force includes the solar part, so the gain in 
effective precision is less. In generalized leapfrog, there is a large subset of operations not involving 
small quantities, and when done in quadruple precision these slow the method ~ 6 fold. 

In summary, when roundoff is considered, we estimate that the parallel algorithm running on 
1000 processors will be about 50 times faster than generalized leapfrog (as implemented in ST94) 
on a single processor. 

To close this paper, we mention some avenues which may lead to more efficient parallel algo- 
rithms. 

1) Rather than devote 1 processor to each of N timesteps, it might be better to set K processors 
working in parallel on each of N/K timesteps. Let us assume (i) that using K processors 
reduces the time needed to complete a timestep by a factor K/g where g > 1 represents the 
degree of parallelism (g will generally be an increasing function of K), and (ii) that the number 
of iterations required for convergence at timestep m is n(m) = cm + no, where c and n are 
constants (cf. eq. 19). Then parallelizing each timestep is faster if the number of processors 
N > n K(g — l)/[c(K — g)]; for K S> 1 and the parameters of equation (19) this yields 
N > 6000(5 - 1). 

2) The iterations converge much more quickly near the start of a block of timesteps than near 
the end. Thus the early processors become free while the later ones are still converging. These 
freed processors could immediately start working on the next block of timesteps, using the 
partial converged results from the previous block as a starting point. 

3) In our implementation, we have not paid much attention to algebraic simplifications, using the 
chain rule to convert accelerations from cartesian variables to Poincare elements. It would be 
worth some effort to try to recast the change of variables, for the chain rule is an arithmetic 
glutton. An intriguing possibility is that one might be able to dispense with the Poincare 
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variables, and simply use the method on the variables a, A, e cos(Q + u), e sm(Q + u), i cos CI, 
isinfi; the accelerations would be just the right sides of Gauss's equations for these variables, 
which are relatively simple to evaluate. Now, the latter variables are not canonical, so the 
integrator will not be symplectic; but it will still be time-symmetric, and possibly that is good 
enough. An alternative approach would be to design an iteration scheme based on a generalized 
leapfrog integrator rather than the implicit midpoint method. 

4) Our iteration scheme (9) is only linearly convergent. It would be interesting to seek an efficient 
quadratically convergent iteration scheme. 

We thank Wayne Hayes, Matt Holman, Agris Kalnajs, and George Lake for discussions. PS 
is grateful to the University of Washington for their hospitality. This research was supported by 
NSERC. 



APPENDIX: THE IMPLICIT MIDPOINT METHOD 

This Appendix derives the error Hamiltonian associated with the implicit midpoint method (3), 
following Feng (1986). 

Let z and w be vectors of the type {p\. .px, Qi- -Qk) in phase space. In the following we will 
use matrix notation: z denotes a column vector, z T the corresponding row vector, and so on; J is 
the usual symplectic matrix. 

Consider a map that takes z(0) to z(i), through a generating function. We write 

z = z(t), z = z(0), ^ 
w=i(z + z ), (f) = <f){w,t), 

and subscripts for gradients and time derivatives, like (fr w and 4> t . The map is 

z - z = J"Vw, (A2) 

(</> w being a gradient). This map is canonical. 1 

Now take the time derivative of (A2), which yields 

(I - \ J" Vww) z = J" 1 Kt (A3) 

Let us now define a function H(z, t) through 

<Mw,t) = H(z,t), z = w-±J0 w , (A4) 



1 There are several ways of proving that maps of the type (A2) are canonical, and here is one. If x and y are 
two phase space vectors, then the map x — > y will be canonical if x T J dx — y T J dy is an exact differential. Let 
w = i (x + y) , and suppose that 

i(x T Jdx-2/ T Jdy) = d0 + d(w T Jx), 
for some 4>{-w,i). Substituting for dy, gives 

(x — y) T J dw = dcj>, 

and hence 

y - x = J~Vw, 

which is of the form (A2). 
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the latter relation being a consequence of equation (A2). Taking the gradient of equation (A4) and 
rearranging gives 

(1-ij-^ww) J-'H Z = J"Vwt, (A5) 
and comparing this with equation (A3) gives 

z = J- 1 H z (z,t). (A6) 

In other words, the map (A2) is equivalent to evolution under the surrogate Hamiltonian H. 
Equation (A4) relating the generating function and the surrogate Hamiltonian is a Hamilton-Jacobi 
equation. 

To derive the implicit midpoint method for a system with Hamiltonian H, we set </>(w, t) = 
tH(w); with this choice equation (A2) evaluated at t = r is identical to equations (3) except for 
minor differences in notation. The surrogate Hamiltonian for the implicit midpoint method with 
timestep t is then defined implicitly by equation (A4), which reads 

H[w- |iJF w (w)] = H(w). (A7) 

We may write H = H + H err , where the error Hamiltonian may be expanded in a power series, 

oo 

H eiI = J2Krrt n - (A8) 

71=1 

By substituting this expansion in (A7) the functions H^ rY , H^ rr , etc. can be determined in succes- 
sion; for example, assuming H is autonomous we have 

2K 

H l„ = 0, Hl rr = | ^2 HiJijHjkJkiHi, (A9) 

i,j,k,l=l 

where Hi = dH/dwi, Hij = d 2 H/dwidwj. Of course, if the system is to be followed over more 
than one timestep, the factor t n in equation (A8) should be replaced by [i(mod r)] n . 

For a nearly integrable Hamiltonian of the form (5), the error Hamiltonian becomes 

12^ SHt°'< 9 2 tr<'> 8H<°) , 

I - = ii '| 1 ¥9S¥ t ° ft (A10) 
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